Rodeo: Sparse Nonparametric Regression in High Dimensions
Authors
Abstract
We present a method for nonparametric regression that performs bandwidth selection and variable selection simultaneously. The approach is based on the technique of incrementally decreasing the bandwidth in directions where the gradient of the estimator with respect to bandwidth is large. When the unknown function satisfies a sparsity condition, our approach avoids the curse of dimensionality, achieving the optimal minimax rate of convergence, up to logarithmic factors, as if the relevant variables were known in advance. The method—called rodeo (regularization of derivative expectation operator)—conducts a sequence of hypothesis tests, and is easy to implement. A modified version that replaces hard with soft thresholding effectively solves a sequence of lasso problems.
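The abstract's greedy scheme (shrink each bandwidth while the derivative of the estimator with respect to that bandwidth tests as significantly nonzero, then freeze it) can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it uses a Nadaraya–Watson estimator, a finite-difference derivative, and a permutation null in place of the paper's closed-form test statistic and variance; the function names `nw`, `dZ`, `rodeo` and all parameter defaults are invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def nw(x0, X, y, h):
    """Nadaraya-Watson estimate at x0 with per-coordinate Gaussian bandwidths h."""
    w = np.exp(-0.5 * np.sum(((X - x0) / h) ** 2, axis=1))
    return (w @ y) / w.sum()

def dZ(x0, X, y, h, j, eps=1e-4):
    """Finite-difference derivative of the estimator with respect to h_j."""
    hp = h.copy()
    hp[j] += eps
    return (nw(x0, X, y, hp) - nw(x0, X, y, h)) / eps

def rodeo(x0, X, y, h0=1.0, beta=0.9, hmin=0.05, B=30):
    """Greedy bandwidth selection in the spirit of the rodeo: shrink h_j while
    the derivative statistic Z_j exceeds a noise threshold, then freeze h_j."""
    n, d = X.shape
    h = np.full(d, h0)
    active = list(range(d))
    while active:
        for j in list(active):
            Z = dZ(x0, X, y, h, j)
            # Permutation stand-in for the paper's closed-form variance s_j^2:
            # permuting y gives a null distribution for Z_j.
            s = np.std([dZ(x0, X, rng.permutation(y), h, j) for _ in range(B)])
            lam = s * np.sqrt(2.0 * np.log(n))  # hard threshold lambda_j
            if abs(Z) > lam and h[j] * beta > hmin:
                h[j] *= beta      # direction looks relevant: keep shrinking
            else:
                active.remove(j)  # freeze h_j; stop testing coordinate j
    return h
```

On data such as `y = 5 * X[:, 0] ** 2 + noise`, the returned bandwidth for the relevant coordinate tends to shrink toward `hmin` while bandwidths for irrelevant coordinates stay large, which is the behavior the abstract describes. Replacing the hard accept/reject step with soft shrinkage of `Z` would correspond to the soft-thresholding variant mentioned above.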
Similar resources
Sparse Nonparametric Density Estimation in High Dimensions Using the Rodeo
We consider the problem of estimating the joint density of a d-dimensional random vector X = (X1,X2, ...,Xd) when d is large. We assume that the density is a product of a parametric component and a nonparametric component which depends on an unknown subset of the variables. Using a modification of a recently developed nonparametric regression framework called rodeo (regularization of derivative...
Nonparametric Density Estimation in High Dimensions Using the Rodeo
We consider the problem of estimating the joint density of a d-dimensional random vector X = (X1, X2, ..., Xd) when d is large. We assume that the density is a product of a parametric baseline component and a nonparametric component. The nonparametric component depends on an unknown subset of the variables. If this subset is small, then nonparametric estimates with fast rates of convergence are...
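The product structure described in this snippet can be written out explicitly; the symbols below (a baseline b with parameter θ, a relevant subset R) are notation chosen for this sketch, not taken from the paper:

```latex
% Density model from the snippet: a parametric baseline times a
% nonparametric factor depending only on an unknown subset R of variables.
f(x_1,\dots,x_d) \;=\; b(x_1,\dots,x_d;\,\theta)\, m(x_R),
\qquad R \subseteq \{1,\dots,d\},\quad |R| \ll d .
```

When |R| is small, estimating the nonparametric factor m only over the coordinates in R is what permits the fast convergence rates the snippet mentions.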
Gradient Weights help Nonparametric Regressors
where A_{n,i}(X) indicates that we are confident in both estimates f_{n,h}(X ± t e_i).
• Fast preprocessing, and online: just 2 estimates of f_{n,h} at X. Metric learning optimizes over a space of possible metrics.
• Only a 2 × 2 × 1 parameter search grid to adapt to d dimensions; KR with bandwidths {h_i}_{i=1}^d needs a d × d parameter search grid.
• General: preprocessing for any distance-based regressor. Other methods apply to particula...
Semiparametric regression models with additive nonparametric components and high dimensional parametric components
This paper concerns semiparametric regression models with additive nonparametric components and high dimensional parametric components under sparsity assumptions. To achieve simultaneous model selection for both nonparametric and parametric parts, we introduce a penalty that combines the adaptive empirical L2-norms of the nonparametric component functions and the SCAD penalty on the coefficient...
Publication date: 2005